Robust Voice Mining Techniques for Telephone Conversations
Authors
Sandeep Manocha, Master of Science, 2006
Thesis directed by: Dr. Carol Y. Espy-Wilson, Department of Electrical Engineering
Abstract
Voice mining involves speaker detection in a set of multi-speaker files. In published work, training data is used for constructing target speaker models. In this study, a new voice mining scenario was considered, where there is no demarcation between training and testing data and prior target speaker models are absent. Given a database of telephone conversations, the task is to identify conversations having one or more speakers in common. Various approaches, including semi-automatic and fully automatic techniques, were explored and different scoring strategies were considered. Given the poor audio quality, automatic speaker segmentation is not very effective. A new technique was developed that avoids speaker segmentation by training a multi-speaker model on the entire conversation. This technique is more robust and outperforms the automatic speaker segmentation approach. On the ENRON database, the equal error rate (EER) is 15.98% for at least one speaker in common and 6.25% for two speakers in common.
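The abstract reports results as an equal error rate (EER), the operating point at which the miss rate equals the false-alarm rate. The following is a minimal sketch of how an EER can be estimated from pairwise conversation scores. It is not the thesis's evaluation code: the function name `equal_error_rate`, the threshold sweep over observed scores, and the example scores are all assumptions made for illustration.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Estimate the EER by sweeping a threshold over all observed scores.

    target_scores:    scores for conversation pairs that share a speaker.
    nontarget_scores: scores for pairs with no speaker in common.
    Returns (eer, threshold) at the point where the miss rate and the
    false-alarm rate are closest to each other.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best = None
    for t in thresholds:
        # Miss: a pair with a common speaker scored below the threshold.
        miss = sum(s < t for s in target_scores) / len(target_scores)
        # False alarm: a pair without a common speaker scored at or above it.
        fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - fa)
        if best is None or gap < best[0]:
            # Report the average of the two rates as the EER estimate.
            best = (gap, (miss + fa) / 2, t)
    return best[1], best[2]


# Hypothetical scores, for illustration only (not data from the thesis).
same_speaker_pairs = [0.9, 0.8, 0.7, 0.3]
different_speaker_pairs = [0.6, 0.4, 0.2, 0.1]
eer, threshold = equal_error_rate(same_speaker_pairs, different_speaker_pairs)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With finite score lists the two error rates rarely cross exactly, so this sketch averages them at the closest point; an evaluation toolkit may interpolate instead.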
Related resources
A semi-automatic approach for speaker mining of tapped telephone conversations
Speaker mining involves speaker detection in a set of multispeaker files. In previous work on speaker mining, training data is used for constructing target speaker models. In this study, a new speaker mining scenario was considered, where there is no demarcation between training and testing data and prior target speaker models are absent. Given the ENRON database which consists of tapped teleph...
Indexing Telephone Conversations by Speakers Using Time-frequency Principal Component Analysis
In this paper, we present an algorithm for the tracking of target speakers in telephone conversations. Speaker tracking consists in retrieving, in an audio recording, segments which have been uttered by a target speaker. We also compare two speech analysis techniques. The first one is the time-frequency principal component analysis. It is a new speech analysis technique based on the extraction ...
Speech Spotter: On-demand Speech Recognition in Human-Human Conversation on the Telephone or in Face-to-Face Situations / Masataka Goto
This paper describes a novel speech-interface function, called “speech spotter”, which enables a user to enter voice commands into a speech recognizer in the midst of natural human-human conversation. In the past, it has been difficult to use automatic speech recognition in human-human conversation since it was not easy to judge, from only microphone input, whether a user was speaking to anothe...
Developing Robust VoIP Router Honeypots Using Device Fingerprints
As the telegram was replaced by telephony, so too Voice over IP (VoIP) systems are replacing conventional switched wire telephone devices. These systems rely on Internet connectivity for the transmission of voice conversations. This paper is an outline of ongoing preliminary research into malfeasant VoIP activity on the Internet. 30 years ago PABX systems were compromised by hackers wanting to m...
Publication date: 2006